Efficient Maximum Margin Clustering via Cutting Plane Algorithm
نویسندگان
چکیده
Maximum margin clustering (MMC) is a recently proposed clustering method, which extends the theory of support vector machine to the unsupervised scenario and aims at finding the maximum margin hyperplane which separates the data from different classes. Traditionally, MMC is formulated as a non-convex integer programming problem and is thus difficult to solve. Several methods have been proposed in the literature to solve the MMC problem based on either semidefinite programming or alternative optimization. However, these methods are time demanding while handling large scale datasets and therefore unsuitable for real world applications. In this paper, we propose the cutting plane maximum margin clustering (CPMMC) algorithm, to solve the MMC problem. Specifically, we construct a nested sequence of successively tighter relaxations of the original MMC problem, and each optimization problem in this sequence could be efficiently solved using the constrained concave-convex procedure (CCCP). Moreover, we prove theoretically that the CPMMC algorithm takes time O(sn) to converge with guaranteed accuracy, where n is the total number of samples in the dataset and s is the average number of non-zero features, i.e. the sparsity. Experimental evaluations on several real world datasets show that CPMMC performs better than existing MMC methods, both in efficiency and accuracy.
منابع مشابه
Optimal Margin Distribution Clustering
Maximum margin clustering (MMC), which borrows the large margin heuristic from support vector machine (SVM), has achieved more accurate results than traditional clustering methods. The intuition is that, for a good clustering, when labels are assigned to different clusters, SVM can achieve a large minimum margin on this data. Recent studies, however, disclosed that maximizing the minimum margin...
متن کاملM3IC: Maximum Margin Multiple Instance Clustering
Clustering, classification, and regression, are three major research topics in machine learning. So far, much work has been conducted in solving multiple instance classification and multiple instance regression problems, where supervised training patterns are given as bags and each bag consists of some instances. But the research on unsupervised multiple instance clustering is still limited . T...
متن کاملMinimum Conditional Entropy Clustering: A Discriminative Framework for Clustering
In this paper, we introduce an assumption which makes it possible to extend the learning ability of discriminative model to unsupervised setting. We propose an informationtheoretic framework as an implementation of the low-density separation assumption. The proposed framework provides a unified perspective of Maximum Margin Clustering (MMC), Discriminative k -means, Spectral Clustering and Unsu...
متن کاملLearning Data Manifolds with a Cutting Plane Method
We consider the problem of classifying data manifolds where each manifold represents invariances that are parameterized by continuous degrees of freedom. Conventional data augmentation methods rely upon sampling large numbers of training examples from these manifolds; instead, we propose an iterative algorithm called MCP based upon a cutting-plane approach that efficiently solves a quadratic se...
متن کاملTighter and Convex Maximum Margin Clustering
Maximum margin principle has been successfully applied to many supervised and semi-supervised problems in machine learning. Recently, this principle was extended for clustering, referred to as Maximum Margin Clustering (MMC) and achieved promising performance in recent studies. To avoid the problem of local minima, MMC can be solved globally via convex semi-definite programming (SDP) relaxation...
متن کامل